New variable selection methods for zero-inflated count data with applications to the substance abuse field.

Authors

  • Anne Buu
  • Norman J Johnson
  • Runze Li
  • Xianming Tan
Abstract

Zero-inflated count data are very common in health surveys. This study develops new variable selection methods for the zero-inflated Poisson regression model. Our simulations demonstrate the negative consequences of ignoring zero-inflation. Among the competing methods, the one-step SCAD method is recommended because it has the highest specificity, sensitivity, and exact fit, and the lowest estimation error. The design of the simulations is based on the special features of two large national databases commonly used in the alcoholism and substance abuse field, so that our findings can be easily generalized to real settings. Applications of the methodology are demonstrated by empirical analyses of data from a well-known alcohol study.
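As a rough illustration of the two ingredients named in the abstract, the sketch below writes out the zero-inflated Poisson probability mass function and the SCAD penalty with its derivative. It is a minimal sketch and not the authors' implementation: the function names, the parameter names (`mu`, `pi_zero`, `tune`), and the conventional choice `a = 3.7` are assumptions made for illustration, and it further assumes that "one-step" refers to a local linear approximation in which the SCAD derivative, evaluated at an initial unpenalized estimate, serves as a per-coefficient L1 weight.

```python
import math


def zip_pmf(y, mu, pi_zero):
    """P(Y = y) for a zero-inflated Poisson count.

    With probability pi_zero the count is a structural zero;
    otherwise it is drawn from Poisson(mu).
    """
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


def scad_penalty(beta, tune, a=3.7):
    """SCAD penalty for one coefficient (a > 2; 3.7 is a common default)."""
    b = abs(beta)
    if b <= tune:
        return tune * b
    if b <= a * tune:
        return (2 * a * tune * b - b ** 2 - tune ** 2) / (2 * (a - 1))
    return tune ** 2 * (a + 1) / 2


def scad_derivative(beta, tune, a=3.7):
    """Derivative of the SCAD penalty with respect to |beta|.

    Under the assumed one-step scheme, this value at an initial
    estimate becomes the L1 weight for that coefficient.
    """
    b = abs(beta)
    if b <= tune:
        return tune
    return max(a * tune - b, 0.0) / (a - 1)
```

The derivative is constant for small coefficients and falls to zero for large ones, so under this scheme large initial estimates are left unpenalized while small ones are shrunk as in the lasso.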


Similar resources

Hurdle, Inflated Poisson and Inflated Negative Binomial Regression Models for Analysis of Count Data with Extra Zeros

In this paper, we propose hurdle regression models for analysing count responses with extra zeros. Maximum likelihood estimation is used to estimate the model parameters. The application of the proposed model is illustrated on an insurance dataset. In this example, many of the claim counts are equal to zero, which clarifies the application of the model with a zero-inflat...


Bayesian Zero-Inflated Poisson model for prognosis of demographic factors associated with using crystal meth in Tehran population

Background: Use of methamphetamine (MA) and other stimulants has increased steadily over the past 10 years. Evaluating risk factors to reduce the problem in the community is one way to protect people from addiction. This study aimed to use a Bayesian zero-inflated Poisson (ZIP) model to investigate the relationship between the number of times crystal meth was used and some demogr...


Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany.

In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero-inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider maximum likelihood function plus a...


An Overview of the New Feature Selection Methods in Finite Mixture of Regression Models

Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...


Comparison of an artificial neural network model with count data regression models in predicting the number of blood donations

Background: Modeling is one of the most important ways to explain the relationship between dependent and independent variables. Since data on the number of blood donations are discrete, it is better to explain them with a discrete distribution such as the Poisson or negative binomial. This research tries to analyze numerical methods using a neural network approach and compare ...



Journal:
  • Statistics in medicine

Volume 30, Issue 18

Pages -

Publication date: 2011